Dylan Rohan - a1844790


Week 1 - Assessment 1 - Part A

The dataset I have selected contains 13 attributes for 178 wines and appears to be free of missing data. It is usually used as a practice dataset for classifier models, but I will be using it to compare the three classes of wine on a given attribute. The class indicates the cultivator of the wine, and all three come from the same region in Italy.

Section 1 - Violin plots

Below I explore the data using multiple violin plots and run a one-way ANOVA on each attribute to determine whether it differs significantly among the three classes. For every attribute, the difference among the three classes was statistically significant (p < 0.05). A significant ANOVA only shows that at least one class mean differs from the others; it does not say which one. Even so, this suggests that, depending on which attribute you find most important in an alcoholic drink, one of these cultivators of wine may be better suited to your tastes than the others.
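To make the test more concrete, here is a minimal sketch of how a one-way ANOVA F statistic is computed by hand. The function name `one_way_anova_f` and the three small groups are made up for illustration and are not taken from the wine data. In practice a library routine such as `scipy.stats.f_oneway` would normally be used, since it also returns the p-value.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of numeric groups.

    Hypothetical helper for illustration only; it does not compute a p-value.
    """
    k = len(groups)                          # number of groups (classes)
    n_total = sum(len(g) for g in groups)    # total number of observations
    grand_mean = mean(x for g in groups for x in g)

    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


# Toy values standing in for one attribute measured in three classes
# (assumed numbers, NOT from the wine dataset).
toy_groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat = one_way_anova_f(toy_groups)
print(f"F = {f_stat:.2f}")  # F = 13.00
```

A large F means the class means are spread far apart relative to the spread within each class. The p-value is then read from the F distribution with k - 1 and n - k degrees of freedom, which this sketch leaves out.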

(The default hover text seems to supply all the information one might want, but it can be altered with the hovertemplate attribute.)

Section 2 - Seaborn horizontal boxplot